Reduction of English function words in Switchboard
Authors
Abstract
We examined the causes of pronunciation reduction in 8458 occurrences of ten frequent English function words in a four-hour sample of conversations from the Switchboard corpus. Using ordinary linear and logistic regression models, we examined the length of the words, the form of their vowel (basic, full, or reduced), and final obstruent deletion. For all of these we found strong, independent effects of speaking rate, predictability, the form of the following word, and planning problem disfluencies. The results bear on issues in speech recognition, models of speech production, and conversational analysis.
Similar resources
Active Learning for Dialogue Act Classification
Active learning techniques were employed for classification of dialogue acts over two dialogue corpora, the English human-human Switchboard corpus and the Spanish human-machine Dihana corpus. It is shown clearly that active learning improves on a baseline obtained through a passive learning approach to tagging the same data sets. An error reduction of 7% was obtained on Switchboard, while a fact...
The effect of language model probability on pronunciation reduction
We investigate how the probability of a word affects its pronunciation. We examined 5618 tokens of the 10 most frequent (function) words in Switchboard: I, and, the, that, a, you, to, of, it, and in, and 2042 tokens of content words whose lexical form ends in a t or d. Our observations were drawn from the phonetically hand-transcribed subset [1] of the Switchboard corpus [2], enabling us to cod...
Restricted cascade and wreath products of fuzzy finite switchboard state machines
A finite switchboard state machine is a specialized finite state machine. It is built by combining the concepts of switching state machines and commutative state machines. The main purpose of this paper is to give a specific algorithm for fuzzy finite switchboard state machines and to investigate the concepts of switching relation, covering, restricted cascade products and wreath products of f...
Multi-Speaker Language Modeling
In conventional language modeling, the words from only one speaker are represented at a time, even for conversational tasks such as meetings and telephone calls. In a conversational or meeting setting, however, different speakers can influence each other. In order to recover this missing inter-speaker information, in this work we present a novel approach for conversational language modeling tha...
SVitchboard II and fiSVer i: high-quality limited-complexity corpora of conversational English speech
In this paper, we introduce a set of benchmark corpora of conversational English speech derived from the Switchboard-I and Fisher datasets. Traditional ASR research requires considerable computational resources and has slow experimental turnaround times. Our goal is to introduce these new datasets to researchers in the ASR and machine learning communities (especially in academia), in order to f...
Publication date: 1998